


TECH 
PAPER 



TP- 2 



November 1982



This paper was prepared for and presented at Wescon/82
© INTEL CORPORATION, 1982



AFN-00136A 
ORDER NUMBER: 210U9-001 



iAPX 286: A Microsystem for the New Generation 
of Operating System Intensive Applications 



George Alexy 
Product Marketing Manager 
High-End Processors 
Intel 
2625 Walsh Road 
Santa Clara, CA 95051 



Since the introduction of the iAPX 86
16-bit microsystem in 1978, the application
of microprocessors has quickly evolved from
dedicated-function markets like
semi-intelligent ASCII terminals and simple
control and instrumentation to more complex
applications like multi-terminal fixed-task
systems (e.g., word processors), single-user
reprogrammable business systems and process
control. This evolution has taken advantage
of the new capabilities and performance of
16-bit micros to satisfy markets
needing more capabilities than 8-bit micros
and lower costs than mini-based products.

This thrust into markets previously
requiring low-end minicomputers has
developed new requirements for
microprocessors. As these markets evolve,
system designers increasingly require
minicomputer-like functionality for product
growth. Migration from single-user,
dedicated multi-function systems to
reprogrammable multi-user and multitasking
systems requires features like memory
protection and memory management in addition
to higher throughput. These features must
protect not only the O.S. from users but
also users from each other. Effective
multi-user system implementations will need
virtual memory to be part of the overall
memory management strategy to eliminate
restrictions on the number of users
supported by the system. It is essential
that these features be provided without
impacting system throughput or system cost
(in terms of hardware and software). An
additional requirement for these markets is
the ability to migrate application software
and data bases to new products.

If we look at the system environment of
a typical multitasking or multi-user
reprogrammable system, we can start to
envision the needs of these systems. The
system consists of an operating system and
one or more tasks or users. The most
obvious need for protection in the system is
isolation of the operating system and data
associated with the operating system from
malicious or accidental corruption by the
task or user programs. If multiple tasks or
multiple users are active in the system, it
is also necessary to isolate the individual
tasks or users from each other. While the
first case is necessary to guarantee the
fundamental integrity of the operating
system and prevent errant tasks or user
programs from disrupting the system, the
second case is equally important if
individual tasks or users are to truly be
provided a secure operating environment
isolated from the side effects of other
users' erroneous software. As a result, a
single task or user's environment should
only be affected by the O.S. or the task or
user program itself.

A typical system environment goes beyond 
this simple model. In addition to the O.S. 
and individual tasks and users, the system 
must support sharing of code and data. 
Shared code would include system utilities, 
special application package libraries like 
floating point math, sort/merge, etc., and
runtime interfaces for languages. Shared
data must be supported if tasks or users are 
to communicate with each other. The result 
is a requirement to provide full protection, 
yet simultaneously support controlled access 
to common code and data areas without being 
hindered by the protection system. This 
system composition leads to the structure 
shown in Figure 1. 

Figure 1 shows the basic O.S. and task 
or user communications interfaces. The 
users or tasks must be able to access shared 
code and data in addition to O.S. services 
while otherwise being fully protected from 
each other. The O.S., on the other hand,
must be fully protected from erroneous use
or access by the users or tasks while
maintaining free and efficient access to
task or user address space. The latter
aspect is desirable for responding to valid 
user requests of O.S. services like memory 
allocation, message communication and fault 
handling. 

Assuming sufficient intertask and O.S.
isolation is provided, the integrity of the
system becomes dependent on the integrity of
the operating system itself. For relatively
simple monitor-oriented operating systems,
proving the integrity of a complete O.S. may
be feasible; however, as the systems evolve
into more complete multitasking and
reprogrammable multi-user systems,
validating the O.S. integrity becomes an
incomplete and inexact activity of trial and
error in an evaluation environment. This
somewhat questionable technique of product
integrity validation has led to the need
for isolation and protection within the O.S.
itself. The most critical system-wide O.S.
procedures and data, like fault handlers,
memory allocation/de-allocation, task
dispatching and system state information,
must be protected from the bulk of system
software associated with activities like
file management, job control and system
debuggers. This approach allows the most
critical system functions to operate
correctly even if other system software is
not fully debugged. It also minimizes the
set of software (often referred to as kernel
software) that the system absolutely depends
on and should be provably error free.

With respect to the model in Figure 1,
isolation within the O.S. would be portrayed
as a separation of the O.S. into kernel and
non-kernel levels where task and user
software continue to maintain direct
interfaces to both levels of software. If
system software is upgraded to accommodate
new features, the new software is typically
assigned to the non-kernel level where bugs
in the new software, while potentially
affecting overall system operations, cannot
cause catastrophic, unreportable errors.
This can significantly enhance the overall
integrity of the system.

While the protection and resulting
system integrity previously discussed are
certainly desirable, they cannot be allowed to
degrade overall system performance. This is
particularly relevant in multitasking
systems where the real time response of the
system (e.g., task suspension, context save,
and load and initiation of a new task) is
typically the most important system
parameter. In a protected environment this
can become a complex operation if protection
state information must also be swapped as
part of the task state. It also becomes
important in multi-user systems to maintain
reasonable user response time.

Figure 1. Typical Multi-user, Multitasking System.

Equally important in multitasking
systems is basic interrupt response time.
In many protected systems, a task switch is
required to invoke the interrupt service
since the interrupt handler is a system-wide
service not associated with the current user
or task's system context. In many instances
this separation is desirable since the
interrupt will cause the dispatch or
scheduling of a service routine that may not
be time critical. However, for interrupts
that do require immediate attention, the
basic system task switch time of hundreds of
microseconds or even milliseconds will be
unacceptably long.

In addition to interrupt response time
and task or user activation time, the speed
with which system services can be accessed
by user software also affects the overall
system performance. This aspect of
performance is critical primarily in
multi-user systems where a rich set of
extended features associated with the
operating system is heavily accessed by user
software. In simple non-protected systems,
access is implemented by simply pushing
appropriate information onto the stack and
calling the desired function. In protected
systems even this simple access technique
becomes relatively complex. User and system
software are typically isolated in different
address spaces to prevent erroneous access
to system data and software by user
software. The simple call mechanism is
replaced by a special supervisor (O.S.) call
instruction that tells the machine to switch
address spaces. This forces the user to
have specific knowledge about system
services and to differentiate them from
normal application services with this
special call instruction. Even more
complex is the parameter passing sequence.
The caller places parameters on his stack,
which is not directly accessible from the
system software address space. The system
software must gain access to the user's
stack and copy the parameters to the system
stack or continue to directly use the user's
stack. A side effect of this is the
potential for user stack errors to cause
errors during system software execution.
Improper initialization of the stack by the
user software before transferring control to
system software can cause stack overflow or
underflow to occur during system software
access to the user stack, thereby allowing a
user error to result in a system error.
Depending on the system design, errors that
occur within the O.S. are often considered
non-recoverable. If this is allowed to
happen as a result of user error, the
protection in the system will have been
circumvented and the intent of providing
protection violated. Compensating for this
and other similar cases typically requires
nontrivial software that increases system
complexity and reduces system performance
and reliability.

The general separation of address spaces
between user and system also has
implications for the system's ability to
access user data areas such as text buffers,
files and I/O buffers. Without appropriate
system structure to allow the O.S. access to
user address space, software must be written
to translate the transferred addresses to a
form usable by the O.S. Although this
typically is not critical to system
performance, it represents additional
complexity when dealing with a protected
multi-user or multitasking system.

One last aspect of this next generation
of microprocessor-based systems,
particularly important in multi-user
applications, is virtual memory. These
systems will need to support multiple
simultaneous users whose individual code
and data space requirements exceed the
physical memory of the system. To allow for
software portability, use of high level
languages and simplified program
implementation, the physical memory of the
system should be transparent to the user and
not require special programming practices
like overlays. Support for virtual memory
basically requires that all code and data be
dynamically relocatable and that access to code
or data not in physical memory be detectable
and recoverable.

A microprocessor designed to address the
needs of this next generation of multi-user
and multitasking systems is the Intel iAPX
286. The design goals of the processor were
specifically targeted at providing very 
comprehensive memory management and 
protection while also significantly 
enhancing the performance of these systems. 

We can start to understand the 
mechanisms of the iAPX 286 and how they 
accommodate the needs of these systems by 
first looking at how the user views his 
virtual address space. Figure 2 shows that 
the task or user has a one gigabyte virtual 
address space partitioned into a half 
gigabyte of global and half gigabyte of 
local address space. 
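
The one gigabyte figure can be checked from field widths given later in this paper: a 13-bit descriptor table index, one bit selecting the global or local table, and a 16-bit offset within the selected segment. The arithmetic below is a worked check only, and it assumes every descriptor describes a maximum-size segment of 64K bytes.

```latex
\begin{aligned}
\text{segments per descriptor table} &= 2^{13} = 8192 \\
\text{maximum bytes per segment} &= 2^{16} = 65{,}536 \\
\text{bytes per address space (global or local)} &= 2^{13} \times 2^{16} = 2^{29} = 536{,}870{,}912 \\
\text{virtual address space per task} &= 2 \times 2^{29} = 2^{30} \text{ bytes} = 1 \text{ gigabyte}
\end{aligned}
```

Each half is therefore roughly the "1/2 billion bytes" shown in Figure 2.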



Figure 2. iAPX 286 User Virtual
Address Space. (Diagram labels: upper 1/2 billion bytes; lower 1/2 billion bytes.)



The global space is a virtual address
space common to and shared by all users or
tasks in the system. It typically contains
the operating system or interfaces to
operating system functions that are separate
tasks. This typically includes system-wide
O.S. software like the kernel, real time
interrupt handlers, O.S. services available
to users or tasks, and shared code and data
like procedure libraries and task
communication buffers.

The local space is totally private to
the user or task and contains application
code and data. From the user's perspective,
the global virtual address space makes the
O.S. and all shared code and data a part of
his address space. From the operating
system perspective the user's local private
address space is also a direct part of the
O.S. address space, allowing the O.S. quick
direct access to the user's data. The O.S.
can use this perspective to its advantage.
Knowing that the local address space is
private to the task, the O.S. can maintain
O.S. related task specific data tables like
memory allocation maps, I/O allocations,
etc., within the local address space. This
becomes particularly advantageous when
swapping users in and out of a multi-user
system since all task or user specific
information is contained within the user or
task's private address space.



Figure 3 shows how this mechanism
applies to a system with multiple users or
multiple tasks. Each task or user has a
separate and private local virtual address
space inaccessible from any other user's
local virtual address space. At any
instant in time, one user or task is
executing. During that instant, the
virtual address space of the system consists
of the global address space and that user or
task's private local address space. This
technique allows all users to have private
memory fully protected from all other users
or tasks while sharing a common address
space containing the O.S. and shared code
and data. Making the O.S. part of the
user's address space makes access to system
procedures simple and straightforward. Only
a standard call instruction is necessary for
the user to transfer control to an O.S.
service. Since the user address space is
likewise within the O.S. address space, the
O.S. has direct and simple access to user
data areas, I/O buffers, etc. Figure 4
demonstrates how the mechanisms of the iAPX
286 would be applied to the typical system
environment discussed at the beginning of
this paper.

Figure 3. Conceptual View of Multi-user or Multitasking System
Virtual Address Spaces. (Diagram labels: global shared address space; Task 1, Task 2, ... Task N local private address spaces; virtual address space while Task 1 is executing.)

Figure 4. iAPX 286 Applied to Multi-user, Multitasking
System Needs. (Diagram label: Task 1 1/2 billion byte private virtual address space.)

The memory model also satisfies the need
for protection of users from each other and
dynamic relocatability of code and data in
addition to a clean user/O.S. interface.
The actual mapping of global and local
virtual address spaces to physical memory is
shown in Figure 5. The keys to address
mapping in the iAPX 286 are descriptor
tables. The global address space and each
local address space are described by
separate descriptor tables. The upper 13
bits of a 32-bit virtual address are used as
an index into either the local or global
descriptor table. One bit of the virtual
address also selects either the global or
local virtual address space. The selected
descriptor table entry contains the base
address of the requested code or data, a
limit value and access rights. The final
physical address is formed by adding the
16-bit offset value of the virtual address to
the base address contained in the
descriptor. This process is shown in Figure
6. Since descriptor tables are memory
resident, every time a descriptor is
accessed, it is automatically cached into an
on-chip descriptor cache where it is held for
future use. This prevents subsequent accesses
to the same area of code or data from
requiring access to the memory-based
information. In addition to the base and
limit fields, the descriptor also contains a
present bit and accessed bit for virtual
memory swapping and access rights bits which
define usage rights like read only, execute
only, etc.
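
The translation steps described above can be sketched in a few lines of Python. This is an illustrative model only, not Intel's implementation: the names `Descriptor`, `ProtectionFault` and `translate` are hypothetical, the descriptor tables are modeled as plain lists, and the bit positions (index in the upper 13 bits of the 16-bit selector, table-select bit immediately below it) are an assumption consistent with the description in the text. The descriptor cache and access rights checks are left out.

```python
from dataclasses import dataclass


@dataclass
class Descriptor:
    base: int    # physical base address of the segment
    limit: int   # highest valid offset within the segment


class ProtectionFault(Exception):
    """Raised when an address falls outside what the tables allow."""


def translate(selector, offset, gdt, ldt):
    """Map a (selector, offset) virtual address to a physical address."""
    index = selector >> 3              # upper 13 bits: descriptor table index
    use_local = bool(selector & 0x4)   # one bit: local (1) or global (0) table
    table = ldt if use_local else gdt
    if index >= len(table):
        raise ProtectionFault("descriptor index beyond end of table")
    descriptor = table[index]
    if offset > descriptor.limit:
        raise ProtectionFault("offset beyond segment limit")
    return descriptor.base + offset    # base + 16-bit offset


# Hypothetical tables, for illustration only.
gdt = [Descriptor(base=0x000000, limit=0xFFFF),
       Descriptor(base=0x100000, limit=0x0FFF)]
ldt = [Descriptor(base=0x200000, limit=0x00FF)]

print(hex(translate(1 << 3, 0x10, gdt, ldt)))          # global entry 1 -> 0x100010
print(hex(translate((0 << 3) | 0x4, 0x20, gdt, ldt)))  # local entry 0  -> 0x200020
```

An offset of 0x100 into the local segment above would raise `ProtectionFault`, since that segment's limit is 0xFF.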



(Figure 5 diagram labels: global address space and user 1 through user N local address spaces; GDT and LDT 1 through LDT N; 16 megabyte physical memory.)



The locations in memory of the global and
currently accessible local descriptor tables
are defined by special registers internal to
the iAPX 286. Since the global address
space remains constant, the register locating
the global descriptor table is typically only
loaded as part of system initialization. The
register pointing to the currently active
local descriptor table is reloaded with the
address of the new table every time the
system switches to a new task or user. This
register is not accessible to the task or
user, which guarantees that during execution
the task or user's address space is uniquely
defined by the global descriptor table and
the user or task's own local descriptor table.
Additional checks guarantee total isolation of
a user's or task's address space from other
users or tasks: index limit checks prevent
accessing a descriptor beyond the end of the
descriptor table, and offset limit checks
prevent accessing outside the area of memory
defined by a descriptor.



The advantages of the system software 
and user software sharing a common virtual 
address space are many. The O.S. has clean 
and simple access to user data areas, the 
user maintains a simple call interface to 
the O.S. and access to interrupt service 
routines is direct and fast. 



Figure 5. Virtual to Physical 
Address Translation. 






Figure 6. iAPX 286 Virtual Address Translation. (Diagram labels: logical address made up of selector and offset; descriptor table; segment descriptor; segment base; real address; target segment; data.)



Because high-frequency and time-critical
interrupt service routines are maintained in
the global address space, they are always
immediately accessible regardless of which
user or task is currently executing. In the
iAPX 286, this results in an interrupt
response time of under four microseconds
with an eight megahertz CPU with full memory
management and protection.

Virtual memory is supported in the
system implementation via a combination of
CPU facilities and software. The basic
facilities provided by the iAPX 286 include
the present bit of the descriptor, which
automatically forces an exception interrupt
if a not-present segment of the user's
virtual address space is accessed, an
accessed bit for usage profiling by software
implementing the swapping policy, and full
restartability of all instructions which use
full virtual addresses. These facilities
provide the required capabilities for
development of a full virtual memory system
without imposing restrictions or
requirements on the system implementation.
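
The division of labor between the CPU facilities and the swapping software can be sketched as follows. This is a simplified model under stated assumptions: the names `Descriptor`, `SegmentNotPresent`, `access` and `read_with_swapping` are hypothetical, the backing store is a plain dictionary standing in for a disk, and the swapping policy (which segment to evict) is omitted entirely.

```python
class Descriptor:
    def __init__(self, name):
        self.name = name
        self.present = False    # present bit: segment is in physical memory
        self.accessed = False   # accessed bit: set on use, read by swapping policy
        self.data = None


class SegmentNotPresent(Exception):
    """Models the exception forced by touching a not-present segment."""

    def __init__(self, descriptor):
        super().__init__("segment not present: " + descriptor.name)
        self.descriptor = descriptor


def access(descriptor, offset):
    """CPU side: fault on a not-present segment, otherwise mark it accessed."""
    if not descriptor.present:
        raise SegmentNotPresent(descriptor)
    descriptor.accessed = True
    return descriptor.data[offset]


def read_with_swapping(descriptor, offset, backing_store):
    """Software side: service the fault, then restart the same access."""
    while True:
        try:
            return access(descriptor, offset)
        except SegmentNotPresent as fault:
            segment = fault.descriptor
            segment.data = backing_store[segment.name]  # swap the segment in
            segment.present = True                      # then retry the access
```

The retry loop stands in for instruction restartability: after the handler brings the segment in, the faulting access is simply attempted again.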

While the points discussed above address
the needs of user isolation, virtual memory
and efficient user/O.S. interface, a
further dimension of protection is
necessary to provide O.S./user isolation
even though at any instant, the O.S. and
currently active user or task share the same
virtual address space. This need is
illustrated in Figure 7 where task or user
code or shared code like a library procedure
attempts to access O.S. routines that should
be restricted from access outside the O.S.

Figure 7. Inner Task Protection. Required to Protect Code and Data From Corruption
By Less Reliable Software in the Same Virtual Address Space. (Diagram labels: illegal operation, shown twice.)

To accommodate the need for this inner
task protection, the iAPX 286 provides a
four level hierarchy of trust within each
task. This four level mechanism effectively
overlays the virtual address space to
provide protection within the address
space. Each area of code and data is
assigned to one of four privilege levels
which controls the ability of code to access
other code and data and the accessibility of
data by code. The innermost level as shown
in Figure 8 is the most trusted code, which
can access any data within the global and
local address spaces and is most protected
from access by software at less privileged
levels. The protection implies restricted
manipulation of data and restricted or
controlled transfer of control to code at a
more privileged level. Subsequent levels in
the hierarchy are protected from access by
software at less trusted levels but not from
access from more trusted levels. At any instant,
the active task or user is executing within
one of the four privilege levels. At that
instant, the program may access data only
at the same or a less privileged level. The
program may transfer control anywhere
directly within the same privilege level or
to procedures at more privileged levels via
a programmer transparent special control
mechanism called a gate. A program cannot
request service of (transfer control to) a
procedure at a less privileged level since
that procedure is considered less trusted
(of questionable integrity) and may not be
able to access data associated with the
calling procedure at a higher level.



Figure 8. iAPX 286 Inner Task Protection Implementation: A Hierarchy of Trust. (Diagram labels: Task A; Task C; gated call and return; unrestricted local access.)



This mechanism allows the system 
designer to partition the software and data 
structures among the various privilege 
levels for maximum system integrity. 
Privilege level 0, the most trusted level, is
typically reserved for the system kernel,
level 1 for the O.S. services, level 2 for 
library procedures and custom O.S. 
extensions while level 3 is typically 
assigned for task and user application 
programs and data. 

An example of data access control is
shown in Figure 9. Code executing at
privilege level two can access data at
levels two and three but not at levels one
and zero. An example of code access control
is shown in Figure 10. Here the code at
level two can use gates at levels two and
three to transfer control to code at more
privileged levels but cannot use gates at
levels one and zero or directly transfer
control to code at more privileged levels.
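
The data access rule in this example reduces to a single comparison. The sketch below is illustrative only; the function name `can_access_data` is hypothetical, and levels are numbered as in the text, with 0 the most trusted and 3 the least trusted.

```python
def can_access_data(code_level, data_level):
    """Code may access data at its own level or at a less privileged level.

    Levels run from 0 (most trusted) to 3 (least trusted), so a less
    privileged level is a numerically larger one.
    """
    return data_level >= code_level


# Code executing at privilege level two, as in the Figure 9 example.
for data_level in range(4):
    print("data at level", data_level, "accessible:",
          can_access_data(2, data_level))
```

For code at level two this reports levels two and three as accessible and levels zero and one as not, matching the example above.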




Figure 9. Data Access Control.

Figure 10. Code Access Control.

(Diagram labels: privilege levels 3 (least trusted), 2, 1, 0 (most trusted).)



The basic operation of the gate
mechanism is shown in Figure 11. The gate
is contained in the descriptor table like a
code segment, but is an additional level of
indirection between the calling program and
the target destination. If the virtual
address specified in the call selects a
gate, the iAPX 286 checks that the gate is
at the same or a less privileged level than
the caller. If it is, a new virtual address
contained within the gate is used to specify
the actual descriptor of the target code
segment and the entry point to the target
code segment. The target code segment can
be at the same or a more privileged level, but
not less privileged.

This mechanism controls the visibility
of more privileged code from less privileged
code. If the caller were to directly
specify the descriptor of the more
privileged routine, a protection check would
be invoked. Therefore, the code is only
accessible to the caller through the gate.
If a gate does not exist at the caller's
privilege level or a less trusted privilege
level, the caller cannot transfer control to
the code. However, gates may exist at more
privileged levels so more privileged code
(for example at level 1) could access the
code (for example at level 0).
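
The two checks applied to a gated call, and the rule for direct transfers, can be modeled as below. This is a sketch under simplifying assumptions, not the processor's full algorithm: `CodeSegment`, `Gate`, `ProtectionFault` and `call` are hypothetical names, and only the privilege comparisons described in the text are modeled.

```python
from dataclasses import dataclass


@dataclass
class CodeSegment:
    name: str
    level: int            # privilege level of the code, 0 = most trusted


@dataclass
class Gate:
    level: int            # privilege level at which the gate is visible
    target: CodeSegment   # code segment the gate leads to


class ProtectionFault(Exception):
    pass


def call(caller_level, entry):
    """Return the privilege level in effect after the transfer."""
    if isinstance(entry, Gate):
        # The gate must be at the caller's level or a less privileged one.
        if entry.level < caller_level:
            raise ProtectionFault("gate not visible at the caller's level")
        # The target must be at the same or a more privileged level.
        if entry.target.level > caller_level:
            raise ProtectionFault("target is less privileged than the caller")
        return entry.target.level
    # A direct transfer must stay within the caller's own level.
    if entry.level != caller_level:
        raise ProtectionFault("direct transfer to another privilege level")
    return entry.level
```

With a level 0 routine reachable through a gate placed at level 1, a level 1 caller succeeds while a level 3 caller faults, which is the visibility control described above.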



Figure 11. Control Transfer Via Gates. (Diagram labels: program-visible selector; gate; segment descriptor; CPU-based CS segment descriptor; entry point; target code.)



In addition to controlling access to
more privileged code, gates also serve
another important function. To prevent the
user or task software from disrupting the
stack for more privileged software, each
privilege level is assigned a unique and
separate stack. Therefore, when less
privileged code calls more trusted software,
it pushes parameters onto its stack and
calls via the gate.

To prevent more privileged software from
requiring access to the caller's stack, the
gate contains a count of the number of
parameters to be copied from the caller's
stack to the called procedure's stack. The
copy process is automatically performed by
the iAPX 286 during the call. The resulting
stack image seen by the called procedure is
the same as that of the calling procedure. If a
stack error occurs, it effectively occurs in
the user's privilege level and not in the
O.S. This capability significantly improves
system call performance and simplifies the
system software by eliminating the need to
access stacks at various levels with
defensive software.
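
The parameter copy can be pictured with two Python lists standing in for the two stacks. This is a minimal sketch with hypothetical names (`gated_call`, `param_count`); stacks grow toward the end of the list, the count is in stack words, and the exception type is simply a stand-in for a fault charged to the caller.

```python
def gated_call(caller_stack, callee_stack, param_count):
    """Copy the top param_count words of the caller's stack to the callee's.

    The caller's stack is validated before the more privileged stack is
    touched, so a bad caller stack never disturbs the callee.
    """
    if param_count > len(caller_stack):
        raise IndexError("caller stack does not hold the declared parameters")
    first = len(caller_stack) - param_count
    callee_stack.extend(caller_stack[first:])   # same order as the caller saw
    return callee_stack


user_stack = ["older frame", "buffer", "length", "flags"]
system_stack = []
gated_call(user_stack, system_stack, 3)
print(system_stack)   # the three parameters, in the caller's order
```

The called procedure then works only on its own stack, which is why no defensive access to the caller's stack is needed.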

An associated protection capability in
the iAPX 286 is the support for verification
of address parameters passed by the caller.
Special instructions in the CPU allow the
called procedure to restrict all addresses
passed on the stack to the privilege level
of the caller. This prevents an errant
program from passing the address of the
O.S.'s own data tables and structures to an
O.S. procedure for manipulation and possible
destruction on behalf of a user or task
program. Although all addresses can be
restricted, an invalid address will not
cause an exception fault unless an attempt
is made to use it; this allows for dummy
variables.
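
One way to picture this restriction is as an adjustment of a privilege field carried in each address. The sketch below assumes, beyond what this paper states, that the low two bits of a selector hold a requested privilege level and that the adjustment works like the 286's adjust-RPL (ARPL) instruction; the function names `adjust_rpl` and `may_access` are hypothetical, and the access rule is simplified.

```python
RPL_MASK = 0x3   # assumed: low two selector bits carry a requested privilege level


def adjust_rpl(pointer_selector, caller_selector):
    """Weaken a passed-in selector to the caller's privilege level.

    The field is only ever raised numerically (made less privileged).
    No fault is raised here; a fault can only occur on later use.
    """
    caller_rpl = caller_selector & RPL_MASK
    if (pointer_selector & RPL_MASK) < caller_rpl:
        return (pointer_selector & ~RPL_MASK) | caller_rpl
    return pointer_selector


def may_access(selector, data_level, current_level):
    """Access is judged at the weaker of the current level and the selector's field."""
    effective_level = max(current_level, selector & RPL_MASK)
    return data_level >= effective_level


os_table = 0x28    # hypothetical selector for level 0 O.S. data, field = 0
user_code = 0x1B   # hypothetical selector for the level 3 caller, field = 3
restricted = adjust_rpl(os_table, user_code)
print(may_access(os_table, 0, 0))     # unrestricted use by level 0 code
print(may_access(restricted, 0, 0))   # same use after restriction
```

After the adjustment, level 0 code using the passed address is treated as if it had the level 3 caller's rights, so an address pointing into O.S. tables is refused while an address into the caller's own data still works.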

Also of critical importance to
multi-user and multitasking systems in a
protected environment is the system
performance in task switching. To address
this need, the iAPX 286 provides a fully
integrated task switch capability. An
extension of the gate mechanism, the
facility allows a program to call or jump to
(invoke) another task without operating
system intervention. When used, the
mechanism automatically saves the entire
state of the current task, loads the state
of the new task and transfers control to
that task. This includes switching to the
new user's virtual address space and
protection attributes. The entire operation
requires only 22 microseconds with an eight
megahertz CPU. Back linkages are also
automatically maintained for nested task
invocations.
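
The sequence of a task switch (save the outgoing state, load the incoming state, switch local address space, record a back link) can be sketched as follows. This is a toy model with hypothetical names (`Task`, `Cpu`, `call_task`, `return_from_task`); registers are a dictionary and the local descriptor table is any Python object, so it shows ordering only, not timing or the actual saved-state layout.

```python
class Task:
    def __init__(self, name, ldt):
        self.name = name
        self.ldt = ldt          # this task's local descriptor table
        self.registers = {}     # saved register state while not running
        self.back_link = None   # task to resume on return, if invoked by a call


class Cpu:
    def __init__(self, first_task):
        self.current = first_task
        self.registers = dict(first_task.registers)
        self.ldt = first_task.ldt

    def _switch(self, new_task):
        self.current.registers = dict(self.registers)   # save outgoing state
        self.registers = dict(new_task.registers)       # load incoming state
        self.ldt = new_task.ldt                         # new local address space
        self.current = new_task

    def call_task(self, new_task):
        new_task.back_link = self.current               # remember the invoker
        self._switch(new_task)

    def return_from_task(self):
        previous = self.current.back_link
        self.current.back_link = None
        self._switch(previous)
```

A nested invocation simply chains back links, so returning from the innermost task resumes its invoker with the invoker's registers and local descriptor table restored.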

Figure 12 demonstrates how the iAPX 286
privilege levels and gates might be applied
to completely satisfy the memory management
and protection requirements of a multi-user
and multitasking system. The result is a
flexible and comprehensive system
environment that does not sacrifice
system performance.








Figure 12. iAPX 286 Applied To Multi-user, Multitasking
System Needs. (Diagram labels: task; LDT; private code and data; all other inter-level accesses or control transfers automatically prohibited.)



